Optimization for latency reduction in Product Quantization #397
+76
−11
This PR optimizes the io.github.jbellis.jvector.quantization.KMeansPlusPlusClusterer#chooseInitialCentroids implementation using the Java Vector API for the following cases.
I tested the optimized code with the setup below and observed a latency reduction of ~6% in Product Quantization when running the ada002-100k dataset.
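For readers unfamiliar with the step being optimized, here is a minimal scalar sketch of k-means++ seeding. It is not the jvector code and not the code in this PR: the class name KMeansPlusPlusSeedingSketch, the method signatures, and the choice to return point indices are all assumptions made for illustration. The sketch keeps the squared-distance loop in its own method, since a loop of that shape is a natural candidate for the Java Vector API (jdk.incubator.vector).

```java
import java.util.Random;

// Hypothetical illustration only: a scalar k-means++ seeding sketch,
// not the jvector KMeansPlusPlusClusterer implementation.
final class KMeansPlusPlusSeedingSketch {

    // Squared Euclidean distance between two equal-length vectors.
    // This is the hot loop that a SIMD implementation would target.
    static float squaredDistance(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the indices of k points chosen as initial centroids.
    static int[] chooseInitialCentroids(float[][] points, int k, Random rng) {
        int n = points.length;
        if (k <= 0 || k > n) {
            throw new IllegalArgumentException("k must be in [1, points.length]");
        }
        int[] chosen = new int[k];
        float[] minDist = new float[n]; // squared distance to the nearest chosen centroid

        // First centroid: uniformly at random.
        chosen[0] = rng.nextInt(n);
        for (int i = 0; i < n; i++) {
            minDist[i] = squaredDistance(points[i], points[chosen[0]]);
        }

        for (int c = 1; c < k; c++) {
            double total = 0;
            for (float d : minDist) total += d;

            // Sample an index with probability proportional to minDist[i].
            int next = -1;
            int lastPositive = -1;
            double r = rng.nextDouble() * total;
            double cumulative = 0;
            for (int i = 0; i < n; i++) {
                if (minDist[i] > 0) lastPositive = i;
                cumulative += minDist[i];
                if (cumulative > r) { next = i; break; }
            }
            // Fallback for floating-point rounding, or when every remaining
            // point coincides with an already chosen centroid (total == 0).
            if (next < 0) next = lastPositive >= 0 ? lastPositive : rng.nextInt(n);

            chosen[c] = next;
            // Refresh each point's distance to its nearest chosen centroid.
            for (int i = 0; i < n; i++) {
                minDist[i] = Math.min(minDist[i], squaredDistance(points[i], points[next]));
            }
        }
        return chosen;
    }
}
```

Calling chooseInitialCentroids(points, k, new Random(seed)) returns k indices into points. Because each seeding round evaluates squaredDistance once per point, the cost of that kernel dominates the step.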
Run Setup:
JVector release used: 4.0.0-beta.1
JDK used:
openjdk version "23.0.1" 2024-10-15
OpenJDK Runtime Environment (build 23.0.1+11-39)
OpenJDK 64-Bit Server VM (build 23.0.1+11-39, mixed mode, sharing)
Hardware used: socket 1 of an m7i.metal-48xl machine
To get a reasonable measurement of the benefit of this optimization, the following changes were made in Bench.java, PQParameters.java and Grid.java:
io/github/jbellis/jvector/example/Grid.java:141 --> /*indexes.forEach((features, index) -> { ...});*/